ACM Digital Library 




PORTAL



US Patent & Trademark Office 









The ACM Portal is published by the Association for Computing Machinery. Copyright © 2004 ACM, Inc. 



Results (page 1): speech synthesis facial animation 






Terms used: speech synthesis facial animation




Found 4,320 of 138,663 






Results 1 - 20 of 200 
Best 200 shown 






1 Automated lip-synch and speech synthesis for character animation 
J. P. Lewis, F. I. Parke 

May 1986 ACM SIGCHI Bulletin, Proceedings of the SIGCHI/GI conference on Human
factors in computing systems and graphics interface, Volume 17 Issue SI

Full text available: pdf (757.78 KB)
Additional Information: full citation, abstract, references, citings, index terms



An automated method of synchronizing facial animation to recorded speech is described. In 
this method, a common speech synthesis method (linear prediction) is adapted to provide 
simple and accurate phoneme recognition. The recognized phonemes are then associated 
with mouth positions to provide keyframes for computer animation of speech using a 
parametric model of the human face. The linear prediction software, once implemented, can 
also be used for speech resynthesis. The synthes ... 

2 Trainable videorealistic speech animation 
Tony Ezzat, Gadi Geiger, Tomaso Poggio 

July 2002 ACM Transactions on Graphics (TOG), Proceedings of the 29th annual
conference on Computer graphics and interactive techniques, Volume 21 Issue 3

Full text available: pdf (524.89 KB)
Additional Information: full citation, abstract, references, citings, index terms



We describe how to create with machine learning techniques a generative, speech 
animation module. A human subject is first recorded using a videocamera as he/she utters 
a predetermined speech corpus. After processing the corpus automatically, a visual speech 
module is learned from the data that is capable of synthesizing the human subject's mouth 
uttering entirely novel utterances that were not recorded in the original video. The 
synthesized utterance is re-composited onto a background sequence ... 

Keywords: facial animation, facial modeling, lip synchronization, morphing, optical flow, 
speech synthesis 



3 Posters and Short Papers: An integrated framework for face modeling, facial motion
analysis and synthesis
Pengyu Hong, Zhen Wen, Thomas Huang

October 2001 Proceedings of the ninth ACM international conference on Multimedia 

Full text available: pdf (2.37 MB)
Additional Information: full citation, abstract, references, index terms

This paper presents an integrated framework for face modeling, facial motion analysis and
synthesis. This framework systematically addresses three closely related research issues:
(1) selecting a quantitative visual representation for face modeling and face animation; (2) 
automatic facial motion analysis based on the same visual representation; and (3) speech 
to facial coarticulation modeling. The framework provides a guideline for methodically 
building a face modeling and animation system. The ... 

Keywords: face animation, face modeling, facial motion analysis, iFACE, speech to facial 
coarticulation modeling 



4 BEAT: the Behavior Expression Animation Toolkit
Justine Cassell, Hannes Hogni Vilhjalmsson, Timothy Bickmore

August 2001 Proceedings of the 28th annual conference on Computer graphics and
interactive techniques

Full text available: pdf (158.86 KB)
Additional Information: full citation, abstract, references, citings, index terms

The Behavior Expression Animation Toolkit (BEAT) allows animators to input typed text that 
they wish to be spoken by an animated human figure, and to obtain as output appropriate 
and synchronized nonverbal behaviors and synthesized speech in a form that can be sent to 
a number of different animation systems. The nonverbal behaviors are assigned on the 
basis of actual linguistic and contextual analysis of the typed text, relying on rules derived 
from extensive research into human conversationa ... 

Keywords: animation systems, facial animation, gesture, speech synthesis 



5 Posters & demos: Speech driven facial animation
P. Kakumanu, R. Gutierrez-Osuna, A. Esposito, R. Bryll, A. Goshtasby, O. N. Garcia

November 2001 Proceedings of the 2001 workshop on Perceptive user interfaces

Full text available: pdf (880.00 KB)
Additional Information: full citation, abstract, references, index terms

The results reported in this article are an integral part of a larger project aimed at achieving 
perceptually realistic animations, including the individualized nuances, of three-dimensional 
human faces driven by speech. The audiovisual system that has been developed for 
learning the spatio-temporal relationship between speech acoustics and facial animation is 
described, including video and speech processing, pattern analysis, and MPEG-4 compliant 
facial animation for a given speaker. In particu ... 

Keywords: MPEG-4, computer vision, facial animation, lip-syncing, speech processing 



6 Facial animation & hair: Geometry-driven photorealistic facial expression synthesis
Qingshan Zhang, Zicheng Liu, Baining Guo, Harry Shum

July 2003 Proceedings of the 2003 ACM SIGGRAPH/Eurographics Symposium on
Computer Animation

Full text available: pdf (62.32 MB)
Additional Information: full citation, abstract, references, citings, index terms

Expression mapping (also called performance driven animation) has been a popular method 
to generate facial animations. One shortcoming of this method is that it does not generate 
expression details such as the wrinkles due to the skin deformation. In this paper, we 
provide a solution to this problem. We have developed a geometry-driven facial expression 
synthesis system. Given the feature point positions (geometry) of a facial expression, our 
system automatically synthesizes the corresponding ex ... 

7 Speech dialogue with facial displays: multimodal human-computer conversation
Katashi Nagao, Akikazu Takeuchi

June 1994 Proceedings of the 32nd conference on Association for Computational
Linguistics

Full text available: pdf (865.08 KB), Publisher Site
Additional Information: full citation, abstract, references

Human face-to-face conversation is an ideal model for human-computer dialogue. One of 
the major features of face-to-face communication is its multiplicity of communication 
channels that act on multiple modalities. To realize a natural multimodal dialogue, it is 
necessary to study how humans perceive information and determine the information to 
which humans are sensitive. A face is an independent communication channel that conveys 
emotional and conversational signals, encoded as facial expression ... 

8 Video Rewrite: driving visual speech with audio
Christoph Bregler, Michele Covell, Malcolm Slaney

August 1997 Proceedings of the 24th annual conference on Computer graphics and
interactive techniques

Full text available: pdf (179.44 KB)
Additional Information: full citation, references, citings, index terms



Keywords: facial animation, lip sync 



9 Facial animation framework for the web and mobile platforms 
Igor S. Pandzic 

February 2002 Proceeding of the seventh international conference on 3D Web 
technology 

Full text available: pdf (906.61 KB)
Additional Information: full citation, abstract, references, citings, index terms

Talking virtual characters are graphical simulations of real or imaginary persons capable of 
human-like behavior, most importantly talking and gesturing. They may find applications on 
the Internet and mobile platforms as newscasters, customer service representatives, sales 
representatives, guides etc. After briefly discussing the possible applications and the 
technical requirements for bringing such applications to life, we describe our approach to 
enable these applications: the Facial Animation ... 

Keywords: FBA, MPEG-4, VRML, facial animation, facial motion cloning, talking head, 
virtual characters, virtual humans, visual text-to-speech 



10 MPEG-4: an object-based multimedia coding standard supporting mobile applications 
Atul Puri, Alexandros Eleftheriadis

June 1998 Mobile Networks and Applications, Volume 3 Issue 1

Full text available: --
Additional Information: full citation, abstract, references, citings, index terms, review

The ISO MPEG committee, after successful completion of the MPEG-1 and the MPEG-2 
standards is currently working on MPEG-4, the third MPEG standard. Originally, MPEG-4 was 
conceived to be a standard for coding of limited complexity audio-visual scenes at very low 
bit-rates; however, in July 1994, its scope was expanded to include coding of scenes as a 
collection of individual audio-visual objects and enabling a range of advanced functionalities 
not supported by other standards. One of the ke ... 

11 Intelligent animated agents for interactive language training
Ron Cole, Tim Carmell, Pam Connors, Mike Macon, Johan Wouters, Jacques de Villiers, Alice
Tarachow, Dominic Massaro, Michael Cohen, Jonas Beskow, Jie Yang, Uwe Meier, Alex Waibel,
Pat Stone, Alice Davis, Chris Soland, George Fortier

June 1998 ACM SIGCAPH Computers and the Physically Handicapped, Issue 61

Full text available: pdf (441.05 KB)
Additional Information: full citation, abstract, index terms

This report describes a three-year project, now eight months old, to develop interactive 
learning tools for language training with profoundly deaf children. The tools combine four 
key technologies: speech recognition, developed at the Oregon Graduate Institute; speech 
synthesis, developed at the University of Edinburgh and modified at OGI; facial animation, 
developed at University of California, Santa Cruz; and face tracking and speech reading, 
developed at Carnegie Mellon University. These tech ... 

12 Communicative facial displays as a new conversational modality 
Akikazu Takeuchi, Katashi Nagao 

May 1993 Proceedings of the SIGCHI conference on Human factors in computing 
systems 

Full text available: pdf (1.03 MB)
Additional Information: full citation, abstract, references, citings, index terms

The human face is an independent communication channel that conveys emotional and 
conversational signals encoded as facial displays. Facial displays can be viewed as 
communicative signals that help coordinate conversation. We are attempting to introduce 
facial displays into computer-human interaction as a new modality. This will make the 
interaction tighter and more efficient while lessening the cognitive load. As the first step, a 
speech dialogue system was selected to investigate the powe ... 

Keywords: anthropomorphism, conversational interfaces, facial expression, multimodal 
interfaces, user interface design 



13 Reception and posters: Model-based talking face synthesis for anthropomorphic 
spoken dialog agent system 

Tatsuo Yotsukura, Shigeo Morishima, Satoshi Nakamura 

November 2003 Proceedings of the eleventh ACM international conference on 
Multimedia 

Full text available: pdf (1.34 MB)
Additional Information: full citation, abstract, references, index terms

Towards natural human-machine communication, interface technologies by way of speech 
and image information have been intensively developed. An anthropomorphic dialog agent 
is an ideal system, which integrates spoken dialog and natural facial expressions. This 
paper reports on our project aiming to create a general-purpose toolkit for building an 
easily customizable anthropomorphic agent. There have been almost no tools so far such as 
intuitive, easy to understand, fully interactive, and open sou ... 

Keywords: anthropomorphic dialog agent, face image synthesis, facial animation, lip 
synchronization 



14 Speech and gaze: A computer-animated tutor for spoken and written language learning
Dominic W. Massaro 

November 2003 Proceedings of the 5th international conference on Multimodal 
interfaces 

Full text available: pdf (481.38 KB)
Additional Information: full citation, abstract, references, index terms

Baldi, a computer-animated talking head is introduced. The quality of his visible speech has 
been repeatedly modified and evaluated to accurately simulate naturally talking humans. 
Baldi's visible speech can be appropriately aligned with either synthesized or natural
auditory speech. Baldi has had great success in teaching vocabulary and grammar to
children with language challenges and training speech distinctions to children with hearing 
loss and to adults learning a new language. We demonstrat ... 

Keywords: facial and speech synthesis, language learning 



15 Facial animation (panel): past, present and future 

Demetri Terzopoulos, Barbara Mones-Hattal, Beth Hofer, Frederic Parke, Doug Sweetland, 
Keith Waters 

August 1997 Proceedings of the 24th annual conference on Computer graphics and 
interactive techniques 

Full text available: ... MB
Additional Information: full citation, references, citings



16 Fuzzy input coding for an artificial neural-network modelling visual speech
movements
Hans-Heinrich Bothe

February 1995 Proceedings of the 1995 ACM symposium on Applied computing

Full text available: pdf (406.28 KB)
Additional Information: full citation, references, index terms

Keywords: Kohonen map, artificial visual speech (AVS), fuzzy input coding, lip-reading,
radial basis function network



17 Heads, faces, hair: Head shop: generating animated head models with anatomical 
structure 

Kolja Kahler, Jorg Haber, Hitoshi Yamauchi, Hans-Peter Seidel 

July 2002 Proceedings of the 2002 ACM SIGGRAPH/Eurographics symposium on 
Computer animation 

Full text available: pdf (9.67 MB)
Additional Information: full citation, abstract, references, citings, index terms

We present a versatile construction and deformation method for head models with 
anatomical structure, suitable for real-time physics-based facial animation. The model is 
equipped with landmark data on skin and skull, which allows us to deform the head in 
anthropometrically meaningful ways. On any deformed model, the underlying muscle and 
bone structure is adapted as well, such that the model remains completely animatable 
using the same muscle contraction parameters. We employ this general techni ... 

Keywords: biological modeling, deformations, facial animation, geometric modeling, 
morphing, physically based animation 



18 An automatic lip-synchronization algorithm for synthetic faces 
K. Waters, T. Levergood 

October 1994 Proceedings of the second ACM international conference on Multimedia 

Full text available: pdf (787.45 KB)
Additional Information: full citation, abstract, references, citings, index terms

This paper addresses the problem of automatically synchronizing computer-generated faces 
with synthetic speech. The complete process provides a novel form of face-to-face 
communication and the ability to create a new range of talking personable synthetic
characters. Based on plain ASCII text input, a synthetic speech segment is generated and
synchronized in real-time to a graphical display of an articulating mouth and face. The key 
component of the algorithm is the run-time ... 

19 The virtual human as a multimodal interface 
Daniel Thalmann 

May 2000 Proceedings of the working conference on Advanced visual interfaces 

Full text available: pdf (1.85 MB)
Additional Information: full citation, abstract, references, index terms

This paper discusses the main issues for creating Interactive Virtual Environments with 
Virtual Humans emphasizing the following aspects: creation of Virtual Humans, gestures, 
interaction with objects, multimodal communication. 

Keywords: action recognition, gestures, multimodal communication, virtual humans 



20 Performance-driven hand-drawn animation 

Ian Buck, Adam Finkelstein, Charles Jacobs, Allison Klein, David H. Salesin, Joshua Seims, 
Richard Szeliski, Kentaro Toyama 

June 2000 Proceedings of the 1st international symposium on Non-photorealistic 
animation and rendering 

Full text available: pdf (1.82 MB)
Additional Information: full citation, references, citings, index terms



Keywords: animation, face tracking, image morphing, non-photorealistic rendering